Data processing device

ABSTRACT

A data processor includes: a parser receiving a data stream and separating, from the stream, video data including data of pictures, a first time information value showing a presentation time of each picture, audio data including data of frames, and a second time information value showing a presentation time of each frame; an output section for outputting the video and audio data to play back the pictures and frames; a reference register for counting amount of time passed since a predetermined reference time; a first differential register for calculating a difference between the first time information value of each picture and the time passed and holding it as a first differential value when the picture is output; a second differential register for calculating the difference between the second time information value of each frame and the time passed and holding it as a second differential value when the frame is output; and a control section for controlling the output of the video data according to the magnitude of the first differential value with respect to that of the second differential value.

TECHNICAL FIELD

The present invention relates to a technique of playing back audio and video by decoding compressed and encoded audio and video data.

BACKGROUND ART

Recently, various methods for cutting down the data sizes of video and audio data, which would take a long time to play back, by compressing and encoding them before writing them on a storage medium have been developed. In the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), the Moving Picture Image Coding Experts Group (MPEG) has been standardizing audio and video coding methods. For example, a video compression method was defined in ISO/IEC 13818-2, an audio compression method was defined in ISO/IEC 13818-3, and a method for synthesizing them was defined in ISO/IEC 13818-1. The last method mentioned is known as the “MPEG System standard”. By using these compression coding techniques, a data stream representing video and audio that would take a long time to play back, such as a movie (i.e., an MPEG system stream), can now be stored on a single storage medium (e.g., an optical disk) while maintaining its high quality.

Meanwhile, methods for storing those data on a storage medium have also been standardized. For instance, a DVD standard called “DVD Specification for Read-Only Disc Version 1.0” is known. Also, the DVD Video Recording standard “DVD Specifications for Rewritable/Re-recordable Discs” was defined in September 1999 as a standard for recording video and audio on a storage medium.

Hereinafter, processing for playing back video and audio synchronously with each other from a data stream on a storage medium will be described. FIG. 1 shows an arrangement of functional blocks in a conventional player 10 that can play back a system stream. In this example, the system stream is supposed to be a program stream with information about system clock reference SCR.

The player 10 includes an AV parser 1, a video decoder 2, an audio decoder 3 and an STC register 4.

The AV parser 1 receives a system stream that has been provided externally and breaks that system stream into audio data and video data. In the following description, a data stream representing audio will be referred to herein as an “audio stream”, while a data stream representing video will be referred to herein as a “video stream”. Also, the AV parser 1 extracts system clock reference SCR, an audio presentation time stamp APTS and a video presentation time stamp VPTS from the system stream. The AV parser 1 sets a reference value for the STC register 4 based on the system clock reference SCR, and outputs the video stream and VPTS to the video decoder 2 and the audio stream and APTS to the audio decoder 3, respectively. In response, the STC register 4 generates a sync signal STC based on the reference value.

The video decoder 2 decodes the video stream by reference to the sync signal STC and video decoding time stamp VDTS and then outputs the decoded video data at a timing when the sync signal STC matches the VPTS. In the NTSC standard, for example, video presentation time stamps VPTS are added to video at an interval corresponding to about 16.7 ms so as to be synchronized with the times at which field pictures are presented. Also, since 30 video frames are presented per second and one frame consists of two fields according to the NTSC standard, each field is refreshed approximately every 16.7 ms.

On the other hand, the audio decoder 3 decodes the audio stream by reference to the sync signal STC and audio decoding time stamp ADTS and then outputs the decoded audio data at a timing when the sync signal STC matches the APTS. For example, audio presentation time stamps APTS are added to audio at an interval corresponding to the audio frame playback timing of about 32 ms.

By performing these processing steps, audio and video can be played back synchronously with each other at the timings that were intended by the maker of the system stream during encoding.

In this example, the sync signal STC is supposed to be generated by reference to the system clock reference SCR. The same reference is used when a digital broadcast is received in real time and clock signals on the transmitting and receiving ends need to be synchronized with each other. If the digital broadcast is a transport stream, however, a program clock reference PCR is used.

Meanwhile, in playing back video and audio by reading out a system stream that has already been stored on a storage medium such as an optical disk, it is not always necessary to reproduce the clock signal at the time of encoding by reference to the system clock reference SCR. Alternatively, the sync signal STC may also be set by using the audio presentation time stamps APTS, for example. Thus, an example of such playback will be described with reference to FIG. 2.

FIG. 2 shows an arrangement of functional blocks in another conventional player 20. The player 20 decodes an audio stream and a video stream from a system stream stored on a storage medium, and outputs video synchronously with audio by reference to audio presentation time stamps APTS. Such a player 20 is disclosed in Japanese Patent Application Laid-Open Publication No. 10-136308, for example.

An AV separating section 12 reads a digitally encoded system stream from a data storage device 11 and separates audio and video data that are stored there after having been multiplexed.

A video processing section 13 decodes the video data and sends video header information, obtained during the decoding process, to a delay detecting section 16. The video presentation time stamps VPTS are described in the header information. Also, the video processing section 13 saves the total number of frames of the video data that has ever been played back since the start of the playback on a video frame counter 18. An audio processing section 14 decodes audio data and sends audio header information, obtained during the decoding process, to a clock generating section 17. The audio presentation time stamps APTS are described in the header information. Also, the audio processing section 14 saves the total amount of the audio data that has ever been played back since the start of the playback on an audio data counter 19.

The clock generating section 17 calculates a reference time, which is expressed as an audio playback duration, based on the total amount of data saved in the audio data counter 19 and the audio header information obtained from the audio processing section 14. The delay detecting section 16 calculates the ideal number of frames of the video data that should be output in accordance with the information about the reference time obtained from the clock generating section 17 and the video header information received from the video processing section 13. Also, the delay detecting section 16 compares the ideal number of frames with the actual number of frames obtained by the video frame counter 18, thereby sensing how the video playback is coming along with the audio playback.

If the delay detecting section 16 has sensed that the video output is behind the audio output, then a frame skipping control section 15 determines frames not to output (i.e., frames to skip) and provides the AV separating section 12 and the video processing section 13 with that information. The video processing section 13 omits the output of those frames to skip but outputs their succeeding frames. As a result, the video delay can be cut down by an amount of time corresponding to the playback duration of one frame (e.g., 33 ms in NTSC), and the video output is no longer trailing behind the audio output. The player 20 can play back audio and video synchronously with each other by such a technique.

The video playback is defined with respect to the audio playback as follows. Suppose the “ideal state” is a state in which video and audio are played back at the timings that were originally intended by the maker during encoding. Generally speaking, if the video is played back within a time frame of −50 ms to +30 ms of the ideal state (i.e., with respect to the audio playback), then a person senses that the audio and video are synchronous with each other. Accordingly, if the video presentation time falls within this permissible range with respect to the audio presentation time, then the video output may be judged as not trailing behind the audio output. Otherwise, the video output may be judged as trailing behind the audio output.

However, if the conventional player played back the video by reference to the audio playback duration, then the following problems would arise.

Specifically, in determining whether or not audio and video are being played back synchronously with each other, the conventional player might judge that the audio and video time lag has exceeded its permissible range, even though the time lag actually falls within the permissible range.

For example, suppose the video is being played back 20 ms behind the audio (a delay that falls within the permissible range). As described above, the APTS is added to an audio frame approximately every 32 ms and the VPTS is added to a video field approximately every 16.7 ms. That is why, depending on the timing at which the VPTS is compared with the APTS, the video might be played back at most 52 ms behind or ahead of the audio. Particularly if the VPTS and APTS should be compared with each other only just after the VPTS has been updated, then the video would be played back +20 ms to +52 ms later than the audio. Thus, if the lag falls within the range of +30 ms to +52 ms, then the viewer would feel uncomfortable seeing the video and audio played back non-synchronously.

Also, if the audio playback ended earlier than the video playback, then the conventional player could no longer continue the video playback after that. This is because once the audio playback has ended, the audio playback duration is not counted anymore and the audio can no longer function as a time reference for playing back the video. In a system stream, in particular, audio data and video data are included as a mixture, and the presentation time of the audio data does not always match that of the video data. Accordingly, the audio data and video data obtained at the end of the data reading operation will finish being played back at mutually different times. And the audio playback may end earlier than the video playback, thus causing various inconveniences.

Thus, an object of the present invention is to synchronize audio and video with each other just as intended when the video is played back by reference to the audio presentation time. Another object of the present invention is to play back the video continuously even if the audio playback has ended earlier than the video playback.

DISCLOSURE OF INVENTION

A data processor according to the present invention includes: a parser, which receives a data stream and which separates, from the data stream, video data including data of a plurality of pictures, a first time information value showing a presentation time of each of those pictures, audio data including data of a plurality of audio frames, and a second time information value showing a presentation time of each of those audio frames; an output section for outputting the video data and the audio data to play back the pictures and the audio frames; a reference register for counting amount of time that has passed since a predetermined reference time; a first differential register for calculating a difference between the first time information value of each said picture and the amount of time passed and holding the difference as a first differential value when the picture is output; a second differential register for calculating a difference between the second time information value of each said audio frame and the amount of time passed and holding the difference as a second differential value when the audio frame is output; and a control section for controlling output of the video data according to the magnitude of the first differential value with respect to that of the second differential value.

The reference register may count the amount of time passed by setting a time when the audio data starts to be output as the predetermined reference time.

When finding the first differential value greater than a sum of the second differential value and a first prescribed value, the control section may instruct the output section to output the video data of the picture being output at that point in time again.

When finding the first differential value smaller than the difference between the second differential value and a second prescribed value, the control section may instruct the output section to skip output of the video data of the picture.

On receiving an instruction to set a pause on playback, the control section may control the output of the video data by reference to the first and second time information values.

On receiving an instruction to cancel the pause on playback, the control section may instruct the output section to output the audio data earlier than the video data if the first time information value is greater than a sum of the second time information value and a third prescribed value.

On receiving an instruction to cancel the pause on playback, the control section may instruct the output section to output the video data earlier than the audio data if the first time information value is smaller than the difference between the second time information value and a fourth prescribed value.

On receiving an instruction to cancel the pause on playback, the control section may instruct the output section to output the video data and the audio data simultaneously.

A data processing method according to the present invention includes the steps of: receiving a data stream; separating, from the data stream, video data including data of a plurality of pictures, a first time information value showing a presentation time of each of those pictures, audio data including data of a plurality of audio frames, and a second time information value showing a presentation time of each of those audio frames; outputting the video data and the audio data to play back the pictures and the audio frames; counting amount of time that has passed since a predetermined reference time; calculating a difference between the first time information value of each said picture and the amount of time passed and holding the difference as a first differential value when the picture is output; calculating a difference between the second time information value of each said audio frame and the amount of time passed and holding the difference as a second differential value when the audio frame is output; and controlling output of the video data according to the magnitude of the first differential value with respect to that of the second differential value.

The step of counting may include counting the amount of time passed by setting a time when the audio data starts to be output as the predetermined reference time.

If the first differential value is greater than a sum of the second differential value and a first prescribed value, the step of outputting may include outputting the video data of the picture being output at that point in time again.

If the first differential value is smaller than the difference between the second differential value and a second prescribed value, the step of outputting may include skipping output of the video data of the picture.

The data processing method may further include the step of receiving an instruction about a pause on playback, and the step of outputting may include controlling the output of the video data by reference to the first and second time information values in accordance with the instruction.

The data processing method may further include the step of receiving an instruction to cancel the pause on playback, and the step of outputting may include outputting the audio data earlier than the video data in accordance with the instruction if the first time information value is greater than a sum of the second time information value and a third prescribed value.

The data processing method may further include the step of receiving an instruction to cancel the pause on playback, and the step of outputting may include outputting the video data earlier than the audio data in accordance with the instruction if the first time information value is smaller than the difference between the second time information value and a fourth prescribed value.

The data processing method may further include the step of receiving an instruction to cancel the pause on playback, and the step of outputting may include outputting the video data and the audio data simultaneously in accordance with the instruction.

A data processing program according to the present invention is executable by a computer. Following this program, a data processor with a built-in computer carries out the processing steps of: receiving a data stream; separating, from the data stream, video data including data of a plurality of pictures, a first time information value showing a presentation time of each of those pictures, audio data including data of a plurality of audio frames, and a second time information value showing a presentation time of each of those audio frames; outputting the video data and the audio data to play back the pictures and the audio frames; counting amount of time that has passed since a predetermined reference time; calculating a difference between the first time information value of each said picture and the amount of time passed and holding the difference as a first differential value when the picture is output; calculating a difference between the second time information value of each said audio frame and the amount of time passed and holding the difference as a second differential value when the audio frame is output; and controlling output of the video data according to the magnitude of the first differential value with respect to that of the second differential value.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows an arrangement of functional blocks in a conventional player 10 that can play a system stream.

FIG. 2 shows an arrangement of functional blocks in another conventional player 20.

FIG. 3 shows a data structure for an MPEG2 program stream 30 compliant with the DVD Video Recording standard.

FIG. 4 shows the data structure of a video pack 32.

FIG. 5 shows an arrangement of functional blocks in a data processor 50 according to a preferred embodiment of the present invention.

FIG. 6 is a flowchart showing the procedure of processing done by the video decoder.

FIG. 7 is a flowchart showing the procedure of processing done by the audio decoder 61.

FIG. 8 is a flowchart showing the procedure of processing done by the data processor 50.

FIG. 9 is a graph showing how the values retained in the VPTS register 56, VPTS differential register 57, APTS register 58 and APTS differential register 59 change.

FIG. 10 is a graph showing how the values retained in the registers 56, 57, 58 and 59 will change in a situation where the output of the audio frame has finished earlier than that of the video frame.

FIG. 11 is a graph showing how the values retained in the registers 56, 57, 58 and 59 will change if a pause has been set on the playback.

FIG. 12 is a flowchart showing the procedure of processing done by the data processor 50 when the playback pause is cancelled.

FIG. 13 is a block diagram showing an alternative configuration for the data processor.

BEST MODE FOR CARRYING OUT THE INVENTION

In the following description, the data structure of a data stream will be described first, and then a data processor according to a preferred embodiment for processing that data stream will be described.

FIG. 3 shows a data structure for an MPEG2 program stream 30 compliant with the DVD Video Recording standard (which will be simply referred to herein as a “data stream 30”).

The data stream 30 includes a plurality of video objects (VOBs) #1, #2, . . . , and #k. Supposing the data stream 30 is a content that was shot with a camcorder, for example, each VOB stores moving picture data that was generated during a single video recording session (i.e., since the user started recording the video and until he or she stopped doing it).

Each VOB includes a plurality of VOB units (video object units; VOBUs) #1, #2, . . . , and #n. Each VOBU is a data unit containing video data in an amount corresponding to a video playback duration of 0.4 second to 1 second.

Hereinafter, the data structure of VOBUs will be described with the first and second VOBUs #1 and #2 in FIG. 3 taken as an example.

VOBU #1 is composed of a number of packs, which belong to the lowest-order layer of the MPEG program stream. In the data stream 30, each pack has a fixed data length (also called a “pack length”) of 2 kilobytes (i.e., 2,048 bytes). At the top of the VOBU, a real time information pack (RDI pack) 31 is positioned as indicated by “R” in FIG. 3. The RDI pack 31 is followed by multiple video packs “V” (including a video pack 32) and multiple audio packs “A” (including an audio pack 33). It should be noted that if the video data has a variable bit rate, the data size of each VOBU is changeable within a range defined by a maximum read/write rate even if the playback duration is the same. However, if the video data has a fixed bit rate, the data size of each VOBU is substantially constant.

Each pack stores the following information. Specifically, the RDI pack 31 stores various information for controlling the playback of the data stream 30, e.g., information representing the playback timing of the VOBU and information for controlling copying of the data stream 30. The video packs 32 store MPEG2-compressed video data thereon. The audio packs 33 store audio data that was compressed so as to comply with the MPEG2 Audio standard, for example.

VOBU #2 is also made up of a plurality of packs. An RDI pack 34 is located at the top of VOBU #2, and is then followed by a plurality of video packs 35 and a plurality of audio packs 36. The contents of the information to be stored in these packs are similar to those of VOBU #1.

Video and audio are decoded and played back on a so-called “access unit” basis. The access unit is one video frame for video and one audio frame for audio. According to the MPEG System standard, time stamps for specifying the time to decode and the time to play back are added to each access unit. That is to say, there are two time stamps, including a decoding time stamp (DTS) showing when to decode and a presentation time stamp (PTS) showing when to play back.

Generally speaking, a player that can play a system stream compliant with the MPEG System standard, including the data stream 30, generates a sync signal STC (system time clock) internally, which will be used as a reference to synchronize the operation of a decoder. If the system stream is a program stream, the sync signal STC is generated by reference to the value of the system clock reference SCR added to the stream. On the other hand, if the system stream is a transport stream, the sync signal STC is generated by reference to the value of the program clock reference PCR added to the stream.

When the time shown by the sync signal STC agrees with the decoding time information DTS, the decoder of the player decodes an access unit to which that DTS is added. Thereafter, when the time shown by the sync signal STC agrees with the presentation time information PTS, the decoder outputs an access unit to which that PTS is added. As a result, video and audio are played back synchronously with each other at the timing intended by the designer of the data stream 30.

Next, the data structure of a video pack will be described. FIG. 4 shows the data structure of the video pack 32. The video pack 32 includes a video packet 41 and a padding packet 42. The padding packet 42 is provided to adjust the pack length of a data pack. Thus, no padding packets are provided if there is no need to adjust the pack length. In that case, only the video packet 41 will be included in the video pack 32.

The video packet 41 includes a pack header (Pack_H) of 14 bytes, a system header (system_H) of 24 bytes, a packet header (Packet_H) 41a and a payload, which are arranged in this order from the top. In the pack header, information showing the type of the pack (i.e., a video packet in this case) is described. The system header is always added to the first pack of each VOBU. The packet header 41a will be described in detail later. And in the payload, compressed and encoded video data is described.

Meanwhile, the padding packet 42 includes a packet header (Packet_H) 42a and padding data 42b. In the packet header 42a, not only information showing its identity as a padding packet but also the data length (byte length) of the padding packet 42 are described. The data length is described in the field of the fifth and sixth bytes (PES_packet_length). A predetermined value is stored as the padding data 42b. This value may be a series of meaningless values “0xFF” (hexadecimal). The amount of the padding data 42b included is determined so as to adjust the pack length of the video pack 32 to 2,048 bytes as described above.
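
Merely as an illustrative sketch of this size bookkeeping (the function name and the assumption of a 6-byte padding-packet header are not taken from the description above), the amount of padding data needed to fill one 2,048-byte pack could be computed as follows:

    #include <stdint.h>

    #define PACK_LENGTH_BYTES 2048  /* fixed pack length of the data stream 30                      */
    #define PES_HEADER_BYTES     6  /* start code prefix (3) + stream_id (1) + PES_packet_length (2) */

    /* Given the total size of the video packet 41 (pack header, system header if any,
     * packet header 41a and payload), return how many bytes of 0xFF padding data 42b
     * the padding packet 42 must carry so that the whole pack is exactly 2,048 bytes.
     * A negative return value means no padding packet is needed. */
    static int32_t padding_data_bytes(int32_t video_packet_total_bytes)
    {
        int32_t padding_packet_bytes = PACK_LENGTH_BYTES - video_packet_total_bytes;
        if (padding_packet_bytes <= 0)
            return -1;                       /* the video packet already fills the pack */
        return padding_packet_bytes - PES_HEADER_BYTES;
    }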

Next, the data structure of the packet header 41a of the video packet 41 will be described. The packet header 41a includes a packet length field 43, a flag field 44 and a header data length field 45. Depending on the values of a time flag field 44a and a PES extension flag field 44b, the packet header 41a may further include an additional field 46.

In the packet length field 43, a packet length (byte length) as measured from that field through the end of the video packet 41 is described. Accordingly, if there is any padding packet 42, the video packet 41 has a shorter packet length and a smaller packet length value is described in the packet length field 43. The next flag field 44 includes a time flag field (PTS_DTS_flag) 44a and a PES extension flag field (PES_extension_flag) 44b. In the time flag field 44a, a flag showing whether or not there are a presentation time stamp (PTS) and a decoding time stamp (DTS) is described as will be mentioned later. In the PES extension flag field 44b, a flag showing whether or not there is a PES extension field is described as will be mentioned later. And in the header data length field 45, the sum of the field lengths of the additional field 46 and a stuffing byte field 49 is stored.

Next, the additional field 46 will be described. For example, if the time flag field 44a shows that there are both a PTS and a DTS, PTS and DTS fields 47, each having a length of 5 bytes, are provided as the additional field 46. The PTS is information about the presentation time of video data, while the DTS is information about the decoding time. Depending on the value of the time flag field 44a, one or both of these two fields are provided.
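
Each 5-byte PTS or DTS field carries a 33-bit value, expressed in 90 kHz ticks and interleaved with marker bits in the manner defined by the MPEG System standard (ISO/IEC 13818-1). The following sketch, whose function name is merely illustrative, shows that extraction under the assumption that the caller passes the 5 bytes of the field:

    #include <stdint.h>

    /* Extract the 33-bit PTS (or DTS) value from the 5-byte field of the packet header.
     * Bit layout per ISO/IEC 13818-1: '001x' PTS[32..30] marker / PTS[29..15] marker /
     * PTS[14..0] marker.  The result is in 90 kHz ticks. */
    static uint64_t parse_pts_field(const uint8_t field[5])
    {
        uint64_t pts = 0;
        pts |= (uint64_t)(field[0] & 0x0E) << 29;  /* PTS[32..30] */
        pts |= (uint64_t)field[1]          << 22;  /* PTS[29..22] */
        pts |= (uint64_t)(field[2] & 0xFE) << 14;  /* PTS[21..15] */
        pts |= (uint64_t)field[3]          << 7;   /* PTS[14..7]  */
        pts |= (uint64_t)(field[4] & 0xFE) >> 1;   /* PTS[6..0]   */
        return pts;
    }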

Also, a PES extension field 48 may be provided as the additional field 46. In the PES extension field 48, information required for decoding the program stream 30, e.g., the capacity of a decoding data buffer, is described.

In the data stream 30, the PES extension field 48 is provided for the first video pack and the first audio pack in each VOBU. The PES extension field 48 may be present if the PES extension flag field 44b is one but absent if the PES extension flag field 44b is zero, for example.

The packet header 41a sometimes includes a stuffing byte field 49. In the stuffing byte field 49, stuffing bytes are stored to adjust the pack length. The stuffing bytes are byte data such as meaningless “0xFF” (hexadecimal) values. The stuffing byte field 49 and the padding packet 42 are provided for the same purpose of adjusting the pack length. Accordingly, conditions that the stuffing bytes are no greater than 7 bytes and that the stuffing byte field 49 and the padding packet 42 cannot be provided in the same pack are defined according to the DVD Video standard. In the example illustrated in FIG. 4, since the padding packet 42 is included in the video pack 32, the length of the stuffing byte field 49 is zero bytes. That is to say, no stuffing byte field is provided.

The data structure of the video pack is shown in FIG. 4. The audio pack may have a similar data structure. Thus, the same statement applies to the audio pack just by replacing the “video packet” with an “audio packet” and the “video data” stored in the payload with “audio data”. Accordingly, in the packet header 41a of the audio packet, a PTS/DTS field 47 is included in the additional field 46 to describe a PTS or a DTS showing its presentation (or output) time. The audio presentation time stamp (PTS) will be abbreviated herein as “APTS”, while the video presentation time stamp (PTS) will be abbreviated herein as “VPTS”.

Hereinafter, a data processor will be described with reference to FIG. 5, which shows an arrangement of functional blocks in a data processor 50 according to this preferred embodiment. In this preferred embodiment, the data processor 50 manages the output timing of a video frame by reference to the presentation time of an audio frame. Thus, this preferred embodiment is different from the processing done by the player 10 shown in FIG. 1, in which the STC register 4 manages both the video and audio outputs alike. It should be noted that the time when the video and audio frame data are output is supposed herein to be identical with the time when those frames are played back as video and output as audio.

In a nutshell, the data processor 50 operates as follows. First, the data processor 50 reads a data stream 30 from a storage medium, and separates compressed and encoded video data including the data of a plurality of pictures, the VPTS of those pictures, audio data including the data of a plurality of frames, and the APTS of those frames from the data stream 30. Then, the data processor 50 decodes the video data and the audio data to play back those pictures and frames.

In parallel with this processing, the data processor 50 counts the amount of time that has passed since a reference time (e.g., the playback start time of audio). In outputting each picture, the data processor 50 calculates a differential value between the VPTS of that picture and the amount of time passed. In outputting each audio frame, on the other hand, the data processor 50 calculates a differential value between the APTS of that audio frame and the amount of time passed. And the data processor 50 controls the playback and output of the video data according to the magnitudes of the video and audio differential values. For example, if the video differential value is smaller than the audio differential value by a predetermined quantity or more, then the data processor 50 skips the output of a single picture. It should be noted that the “picture” is herein a notion representing both a frame and a field alike.

Hereinafter, respective components of the data processor 50 will be described before it is described how the data processor 50 operates.

The data processor 50 includes an optical pickup 51, a playback processing section 52, an AV parser 53, a reference counter register 54, an AV synchronization control section 55, a VPTS register 56, a VPTS differential register 57, an APTS register 58, an APTS differential register 59, a video decoder 60 and an audio decoder 61.

The optical pickup 51 reads a data stream 30 from a storage medium, which may be an optical disk such as a DVD or a Blu-ray Disc (BD), for example. The playback processing section 52 converts the data stream 30, obtained as an analog signal, into digital data and then outputs it.

The AV parser 53 breaks the data stream 30 into a video stream and an audio stream. Also, the AV parser 53 extracts not only an SCR from the system stream but also video presentation time stamps VPTS from the video packs 32, 35, etc. of the data stream 30 and audio presentation time stamps APTS from the audio packs 33, 36, etc., respectively.

The video decoder 60 decodes the video stream, thereby generating video data. More specifically, when the value shown by the time stamp DTS and the reference counter value shown by the reference counter register 54 either agree with each other or have a predetermined difference, the video decoder 60 decodes the picture data to which the DTS is added. And when the VPTS value and the reference counter value either agree with each other or have a predetermined difference, the video decoder 60 outputs the picture data to which that VPTS is added. Also, on receiving an instruction to skip the output of a frame, the video decoder 60 skips the output of that frame and resumes outputting from the following frame onward.
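
Merely as an illustrative sketch (the offset parameter stands in for the "predetermined difference" mentioned above, and the function names are not taken from the description), the timing tests used by the video decoder 60 might be expressed as follows:

    #include <stdbool.h>
    #include <stdint.h>

    /* The video decoder 60 starts decoding a picture when its DTS matches the value of
     * the reference counter register 54 (or differs from it by a fixed offset), and
     * outputs the picture when its VPTS does.  All values are in 90 kHz ticks. */
    static bool time_to_decode(int64_t dts, int64_t reference_counter, int64_t offset)
    {
        return dts - reference_counter == offset;
    }

    static bool time_to_output(int64_t vpts, int64_t reference_counter, int64_t offset)
    {
        return vpts - reference_counter == offset;
    }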

The audio decoder 61 decodes the audio stream, thereby generating audio data. And the audio decoder 61 decodes the audio frame data and outputs the audio data.

The reference counter register 54 counts the amount of time passed by reference to the audio data output timing; in other words, it advances at the same rate as the time passing since the playback reference time. The amount of time passed is supposed to be represented at a precision of 90 kHz, for example. With this precision, the reference time can be updated at a sufficiently shorter interval than a video frame or an audio frame.

The VPTS register 56 holds a VPTS representing a video presentation time stamp. The VPTS differential register 57 holds the differential value between the VPTS value retained in the VPTS register 56 and the value of the reference counter register 54 (i.e., a VPTS differential value). The APTS register 58 holds an APTS representing an audio presentation time stamp. The APTS differential register 59 holds the differential value between the APTS value retained in the APTS register 58 and the value of the reference counter register 54 (i.e., an APTS differential value). Once it has received a certain value, each of the VPTS register 56, VPTS differential register 57, APTS register 58 and APTS differential register 59 keeps holding that value until the next value is input.
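
For orientation, this register set can be pictured as the following C structure; the struct and field names are merely illustrative, and every value is expressed in the 90 kHz tick unit given for the reference counter register 54:

    #include <stdint.h>

    /* All values are in 90 kHz ticks (1 tick = 1/90,000 second). */
    typedef struct {
        int64_t reference_counter;   /* reference counter register 54                    */
        int64_t vpts;                /* VPTS register 56: VPTS of the picture last output */
        int64_t vpts_differential;   /* VPTS differential register 57                     */
        int64_t apts;                /* APTS register 58: APTS of the frame last output   */
        int64_t apts_differential;   /* APTS differential register 59                     */
    } SyncRegisters;

    #define MS_TO_TICKS(ms) ((int64_t)(ms) * 90)   /* e.g., 30 ms corresponds to 2,700 ticks */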

The AV synchronization control section 55 compares the values of the VPTS differential register 57 and the APTS differential register 59 with each other, and controls the video decoder 60 so as to adjust the video data played back and output. In this manner, the video is synchronized with the audio.

Hereinafter, it will be described more fully how the video decoder 60 and the audio decoder 61 operate. FIG. 6 shows the procedure of processing done by the video decoder. On receiving an instruction on what video frame to output from the AV synchronization control section 55 in Step S61, the video decoder 60 decodes and outputs that frame in Step S62. Next, in Step S63, the video decoder 60 writes a VPTS value, associated with the video frame output, onto the VPTS register 56. Thereafter, in Step S64, the video decoder 60 subtracts the counter value, stored in the reference counter register 54 at that point in time, from the value held in the VPTS register 56 and then writes the resultant VPTS differential value onto the VPTS differential register 57. By performing these processing steps, the video decoder 60 updates the values retained in the VPTS register 56 and in the VPTS differential register 57.

Optionally, those values may be written on the VPTS register 56 and VPTS differential register 57 before the video is output. Also, the values of the VPTS register 56 and the VPTS differential register 57 may be updated with the VPTS value of the next frame while the video is being output.

Meanwhile, FIG. 7 shows the procedure of processing done by the audio decoder 61. First, in Step S71, the audio decoder 61 decodes and outputs an audio stream. Next, in Step S72, the audio decoder 61 writes an APTS value, associated with the audio frame output, onto the APTS register 58. Thereafter, in Step S73, the audio decoder 61 subtracts the counter value, stored in the reference counter register 54 at that point in time, from the value held in the APTS register 58 and then writes the resultant APTS differential value onto the APTS differential register 59.

Optionally, those values may be written on the APTS register 58 and APTS differential register 59 before the audio is output. Also, the values of the APTS register 58 and the APTS differential register 59 may be updated with the APTS value of the next audio frame (in 32 ms) while the audio is being output.
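
The register updates of Steps S63-S64 and Steps S72-S73 then amount to the following two symmetrical routines, shown only as a sketch whose function and parameter names are illustrative:

    #include <stdint.h>

    /* Steps S63-S64: after a video frame is output, latch its VPTS and the VPTS
     * differential value (VPTS minus the reference counter value at that moment). */
    static void update_video_registers(int64_t vpts_of_output_frame,
                                       int64_t reference_counter,
                                       int64_t *vpts_register,        /* register 56 */
                                       int64_t *vpts_diff_register)   /* register 57 */
    {
        *vpts_register      = vpts_of_output_frame;
        *vpts_diff_register = vpts_of_output_frame - reference_counter;
    }

    /* Steps S72-S73: the audio decoder 61 latches the APTS register 58 and the APTS
     * differential register 59 in exactly the same way. */
    static void update_audio_registers(int64_t apts_of_output_frame,
                                       int64_t reference_counter,
                                       int64_t *apts_register,        /* register 58 */
                                       int64_t *apts_diff_register)   /* register 59 */
    {
        *apts_register      = apts_of_output_frame;
        *apts_diff_register = apts_of_output_frame - reference_counter;
    }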

Hereinafter, the processing done by the data processor 50 will be described. FIG. 8 shows the procedure of the processing done by the data processor 50. First, in Step S81, the optical pickup 51 reads a data stream from an optical disk and the playback processing section 52 performs digitization processing. Next, in Step S82, the AV parser 53 separates a video stream, VPTS, an audio stream and APTS from the data stream 30 that has been read.

In Step S83, the video decoder 60 outputs a video frame and writes its VPTS value onto the VPTS register 56 and its VPTS differential value onto the VPTS differential register 57, respectively. Next, in Step S84, the audio decoder 61 outputs an audio frame and writes its APTS value onto the APTS register 58 and its APTS differential value onto the APTS differential register 59, respectively.

Subsequently, in Step S85, the AV synchronization control section 55 compares the VPTS differential value retained in the VPTS differential register 57 with the APTS differential value retained in the APTS differential register 59.

The meaning of this comparison to be made in Step S85 will be described with reference to FIG. 9, which is a graph showing how the values retained in the VPTS register 56, VPTS differential register 57, APTS register 58 and APTS differential register 59 change. This graph shows an example in which the video is synchronized with the audio and output continuously so as to be 20 ms ahead of the audio.

In this preferred embodiment, the APTS value increases discretely every 32 ms and the VPTS value also increases discretely every 16.7 ms. In FIG. 9, the times at which the APTS values and APTS differential values are updated are indicated by the solid circles ●, while the times at which the VPTS values and VPTS differential values are updated are indicated by the solid triangles ▴. The audio frames and video frames are output at these timings. In this example, the reference counter is supposed to start counting at a point in time of −50 ms at the beginning of the audio playback for convenience's sake. However, the counter's initial value at the beginning of playback may be set to any other arbitrary value.

As described above, the APTS register 58 and the VPTS register 56 each hold the previous value until the next value is input thereto. Accordingly, the values held by the APTS register 58 and VPTS register 56 are plotted as steps. That is why the difference between the APTS and VPTS values changes significantly according to the timing at which that difference is calculated. Consequently, it is not appropriate to check the synchronization between the audio and video based on this difference.

The audio frame and video frame are output when the APTS value or VPTS value either matches the reference counter value or has a predetermined difference from it. In the example shown in FIG. 9, the audio and video frames are output when a difference of −50 ms is made. Accordingly, under the condition of this preferred embodiment in which the reference counter value is generated by reference to the presentation time of the audio frame, the differential value between the APTS value and the reference counter value (i.e., the APTS differential value) is always constant, as plotted by the graph representing the APTS differential value.

On the other hand, the differential value between the VPTS value and the reference counter value (i.e., the VPTS differential value) remains constant unless the output of the video frame is too late or too early. For example, if the video is output within −50 ms to +30 ms with respect to the audio, then the VPTS differential value becomes substantially constant, as plotted by the graph representing the VPTS differential value. In that case, the viewer feels that the video is synchronized with the audio. However, if the video frame is output too late to keep the delay within that range, then the VPTS differential value will decrease. Conversely, if the video frame is output too early, then the VPTS differential value will increase.

Accordingly, by comparing the APTS differential value and the VPTS differential value with each other, it is possible to accurately estimate, without depending on the timing of that comparison, how much the output of the video frame is behind that of the audio frame.
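
The following self-contained sketch illustrates this point with assumed numbers (an audio/video pair output with the −50 ms offset of FIG. 9, and the video arbitrarily taken to be 20 ms late): because both differential values are latched at their own output instants, their difference equals the true lag no matter when the comparison itself is scheduled.

    #include <stdio.h>

    int main(void)
    {
        const double offset_ms = 50.0;   /* frames are output when PTS - counter = -50 ms */
        const double lag_ms    = 20.0;   /* assumed: the video frame is output 20 ms late */

        double apts = 320.0;             /* presentation time of some audio frame (ms)    */
        double vpts = 333.3;             /* presentation time of some video field (ms)    */

        /* reference counter value at the instant each frame is actually output */
        double counter_at_audio_out = apts + offset_ms;           /* audio is on time    */
        double counter_at_video_out = vpts + offset_ms + lag_ms;  /* video is 20 ms late */

        double apts_diff = apts - counter_at_audio_out;           /* latched: -50 ms     */
        double vpts_diff = vpts - counter_at_video_out;           /* latched: -70 ms     */

        /* However late the AV synchronization control section 55 reads the registers,
         * the held values do not change, so the estimated lag is exact: */
        printf("estimated video lag = %.1f ms\n", apts_diff - vpts_diff);  /* prints 20.0 */
        return 0;
    }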

Referring back to FIG. 8, if the VPTS differential value has turned out to be greater than the APTS differential value by a predetermined amount of time α (=30 ms, for example) or more as a result of the comparison made in Step S85, then the process advances to Step S86. Otherwise, the process advances to Step S87.

If the process advances to Step S86, the video frame has been output too early. In that case, the video decoder 60 outputs the current video frame again in accordance with the instruction given by the AV synchronization control section 55. As a result, a pause is set on the output of the video frame to let the output of the audio frame catch up with that of the video frame, thereby synchronizing the audio and video together. After that, the process advances to Step S89.

Meanwhile, in Step S87, it is determined whether or not the VPTS differential value is smaller than the APTS differential value minus a predetermined amount of time β (=50 ms, for example). If the answer is YES, the process advances to Step S88. Otherwise, it means that the VPTS differential value is approximately equal to the APTS differential value. Thereafter, the process advances to Step S89.

In Step S88, in accordance with the instruction given by the AV synchronization control section 55, the video decoder 60 skips one video frame and outputs the following video frame instead. As a result, the output of the video frame catches up with that of the audio frame, thus synchronizing the audio and video together.

Finally, in Step S89, the AV synchronization control section 55 determines whether or not there are any video frames left. If the answer is YES, then Step S83 and the following processing steps are performed all over again. Otherwise, the process ends.
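
In other words, Steps S85 through S88 reduce to the following three-way decision; α = 30 ms and β = 50 ms are the example values quoted above, and the enum and function names are merely illustrative:

    #include <stdint.h>

    #define MS_TO_TICKS(ms) ((int64_t)(ms) * 90)   /* 90 kHz ticks */

    typedef enum {
        VIDEO_REPEAT_FRAME,    /* Step S86: video output too early; output the same frame again */
        VIDEO_SKIP_FRAME,      /* Step S88: video output too late; skip one frame               */
        VIDEO_OUTPUT_NORMALLY  /* roughly in sync: output the next frame as usual               */
    } VideoAction;

    static VideoAction decide_video_action(int64_t vpts_differential, int64_t apts_differential)
    {
        const int64_t alpha = MS_TO_TICKS(30);   /* example value of the amount of time alpha */
        const int64_t beta  = MS_TO_TICKS(50);   /* example value of the amount of time beta  */

        if (vpts_differential >= apts_differential + alpha)   /* Step S85 */
            return VIDEO_REPEAT_FRAME;
        if (vpts_differential < apts_differential - beta)     /* Step S87 */
            return VIDEO_SKIP_FRAME;
        return VIDEO_OUTPUT_NORMALLY;
    }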

Next, it will be described with reference to FIG. 10 what processing is done if the output of the audio frame has finished earlier than that of the video frame.

FIG. 10 is a graph showing how the values retained in the registers 56, 57, 58 and 59 will change in a situation where the output of the audio frame has finished earlier than that of the video frame. Suppose the output of the audio frame has finished at a time t=t1 and is now discontinued. It should be noted that the value of the APTS register no longer increases after the time t=t1 because the APTS is not updated anymore after that. Accordingly, if one tried to synchronize the video by reference to the value of the APTS register 58, then the output of the video would come to a halt, too, as indicated by the dashed line VPTS(1).

However, if the VPTS differential register 57 and the APTS differential register 59 are used, the same value of the APTS differential register will be output continuously even after the output of the audio data has discontinued. Thus, the same state can be maintained as if the audio were still being output continuously. That is why the output of the video frame can be continued just as intended by the same processing as that described above. The values of the VPTS register 56 in such a situation are represented by VPTS(2).

Next, it will be described with reference to FIG. 11 what processing should be done if a pause has been set on the playback of audio and video in response to the viewer's command, for example. FIG. 11 is a graph showing how the values retained in the registers 56, 57, 58 and 59 will change if a pause has been set on the playback. Suppose a pause has been put on the playback at a time t=t3. During the pause interval after that time t=t3, the same video frame is output over and over again. Accordingly, the same VPTS value as that output just before the pause was set is continuously input to the VPTS register 56. Also, since the VPTS value is constant and the reference counter value increases, the VPTS differential value input to the VPTS differential register 57 gradually decreases stepwise. Meanwhile, the output of the audio stops. Thus, the APTS value and the APTS differential value remain those updated at the time t=t2, which is just before the time t=t3 when the pause was put. Comparing the APTS differential value and the VPTS differential value with each other, their difference keeps increasing during the pause interval. As a result, although the playback is actually being carried on normally, the AV synchronization control section 55 erroneously senses a delay in the playback of the video frame, thus causing a malfunction.

Thus, in this case, the AV synchronization control section 55 may continue to output the video frame by using the APTS value held in the APTS register 58 and the VPTS value held in the VPTS register 56. That is to say, the APTS value held in the APTS register 58 and the VPTS value held in the VPTS register 56 keep quite the same values, and their difference is also constant during the pause interval. That is why, by reference to this difference, the AV synchronization control section 55 never senses a lag in the playback timing of the video frame. According to the NTSC standard, the video frame is output every 1/29.97 second (i.e., about 1/30 second) with respect to the frame that was presented just before the pause setting time t=t3. It should be noted that if the video frame is presented by the interlacing technique, each of the previous two fields is output every 1/59.94 second (i.e., about 1/60 second). In setting these timings, no APTS values in the APTS register 58 are used in particular.

When the pause is cancelled after that, the data processor 50 can synchronize the audio and video together by performing the processing steps shown in FIG. 12, which shows the procedure of the processing to be done by the data processor 50 when the playback pause is cancelled. First, in Step S120, the AV synchronization control section 55 receives an instruction to cancel the pause from the viewer, for example. Next, if the VPTS value is found greater than the sum of the APTS value and a predetermined amount of time γ (=30 ms, for example) in Step S121, then the process advances to Step S122. Otherwise, the process advances to Step S123. In Step S122, the AV synchronization control section 55 cancels the pause on the audio playback and then cancels the pause on the video playback. By removing the pause on the audio playback first, the output of the audio frame catches up with that of the video frame and the audio and video can be synchronized with each other. Then, the process shown in FIG. 12 ends and advances to the remaining process shown in FIG. 8.

In Step S123, it is determined whether or not the VPTS value is smaller than the APTS value minus a predetermined amount of time δ (=50 ms, for example). If the answer is YES, the process advances to Step S124. Otherwise, the process advances to Step S125.

In Step S124, the AV synchronization control section 55 cancels the pause on the video playback and then cancels the pause on the audio playback. As a result, the output of the video frame catches up with that of the audio frame and the audio and video can be synchronized with each other. Then, the process shown in FIG. 12 ends and advances to the remaining process shown in FIG. 8.

In Step S125, the pause on the video playback and the pause on the audio playback are cancelled at the same time. Then, the process shown in FIG. 12 ends and advances to the remaining process shown in FIG. 8. Optionally, the audio and video may also be synchronized with each other by comparing the sum of the values held in the VPTS differential register and the reference counter register with the value held in the APTS register.
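
The pause-cancellation decision of FIG. 12 can be sketched in the same illustrative manner, with γ = 30 ms and δ = 50 ms being the example values given above and the names again merely illustrative:

    #include <stdint.h>

    #define MS_TO_TICKS(ms) ((int64_t)(ms) * 90)   /* 90 kHz ticks */

    typedef enum {
        RESUME_AUDIO_FIRST,     /* Step S122: cancel the audio pause first so the audio catches up */
        RESUME_VIDEO_FIRST,     /* Step S124: cancel the video pause first so the video catches up */
        RESUME_SIMULTANEOUSLY   /* Step S125: cancel both pauses at the same time                  */
    } ResumeOrder;

    /* vpts and apts are the values held in the VPTS register 56 and the APTS register 58
     * during the pause interval. */
    static ResumeOrder decide_resume_order(int64_t vpts, int64_t apts)
    {
        const int64_t gamma = MS_TO_TICKS(30);   /* example value of the amount of time gamma */
        const int64_t delta = MS_TO_TICKS(50);   /* example value of the amount of time delta */

        if (vpts > apts + gamma)                 /* Step S121 */
            return RESUME_AUDIO_FIRST;
        if (vpts < apts - delta)                 /* Step S123 */
            return RESUME_VIDEO_FIRST;
        return RESUME_SIMULTANEOUSLY;
    }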

It is difficult to identify the currently output frame just with the VPTS differential register 57 and the reference counter register 54. However, by providing the VPTS register 56, the video frame being output can be identified easily.

A data processor 50 according to a preferred embodiment of the present invention works as described above. However, the data processor 50 may also have a configuration such as that shown in FIG. 13, not just the configuration shown in FIG. 5. FIG. 13 shows an alternative configuration for the data processor. In this example, the function of the AV synchronization control section 55 is carried out by a general-purpose CPU 131 that operates following a playback processing program 130, and all components of the data processor but the optical pickup 51, the playback processing section 52 and the AV synchronization control section 55 are implemented as a single decoder chip. The illustration of the optical pickup 51 and the playback processing section 52 is omitted.

The playback processing program 130 is described so as to make the data processor operate following the procedures shown in FIGS. 8 and 12 and is executed by the general-purpose CPU 131. If the CPU performs the function of the AV synchronization control section 55 by executing a software program, then the real-time requirement on the processing of the AV synchronization control section 55 can be relaxed. More specifically, the general-purpose CPU 131 needs to perform not only the processing described above but also other types of processing, including moving the optical pickup 51 and the digitization processing at the playback processing section 52, as well. The processes shown in FIGS. 8 and 12 are not limited to any particular timing for comparing the VPTS differential value and the APTS differential value, but can achieve accurate synchronization whenever performed. Thus, there are fewer scheduling limits on the operation of the general-purpose CPU 131. This advantage is significant considering that the general-purpose CPU of the conventional player 20 should update the PTS at a strict timing in order to improve the precision of synchronization.

INDUSTRIAL APPLICABILITY

According to the present invention, audio and video can be synchronized together just as intended by comparing the progress of the audio and video outputs with each other using an APTS differential register and a VPTS differential register. In addition, even if the audio data has discontinued earlier than the video data, the video can still be output continuously by reference to the audio. Furthermore, by providing an APTS register, the progress of the audio and video outputs can also be compared even during a pause interval and the audio and video can be synchronized with each other after the pause has been cancelled. As a result, a player that can carry out playback without making the viewer feel uncomfortable can be provided.

CLAIMS

1. A data processor comprising: a parser, which receives a data stream and which separates, from the data stream, video data including data of a plurality of pictures, a first time information value showing a presentation time of each of those pictures, audio data including data of a plurality of audio frames, and a second time information value showing a presentation time of each of those audio frames; an output section for outputting the video data and the audio data to play back the pictures and the audio frames; a reference register for counting amount of time that has passed since a predetermined reference time; a first differential register for calculating a difference between the first time information value of each said picture and the amount of time passed and holding the difference as a first differential value when the picture is output; a second differential register for calculating a difference between the second time information value of each said audio frame and the amount of time passed and holding the difference as a second differential value when the audio frame is output; and a control section for controlling output of the video data according to the magnitude of the first differential value with respect to that of the second differential value.
2. The data processor of claim 1, wherein the reference register counts the amount of time passed by setting a time when the audio data starts to be output as the predetermined reference time.

3. The data processor of claim 1, wherein when finding the first differential value greater than a sum of the second differential value and a first prescribed value, the control section instructs the output section to output the video data of the picture being output at that point in time again.

4. The data processor of claim 1, wherein when finding the first differential value smaller than the difference between the second differential value and a second prescribed value, the control section instructs the output section to skip output of the video data of the picture.

5. The data processor of claim 1, wherein on receiving an instruction to set a pause on playback, the control section controls the output of the video data by reference to the first and second time information values.
6. The data processor of claim 5, wherein on receiving an instruction to cancel the pause on playback, the control section instructs the output section to output the audio data earlier than the video data if the first time information value is greater than a sum of the second time information value and a third prescribed value.

7. The data processor of claim 5, wherein on receiving an instruction to cancel the pause on playback, the control section instructs the output section to output the video data earlier than the audio data if the first time information value is smaller than the difference between the second time information value and a fourth prescribed value.
8. The data processor of claim 5, wherein on receiving an instruction to cancel the pause on playback, the control section instructs the output section to output the video data and the audio data simultaneously.

9. A data processing method comprising the steps of: receiving a data stream; separating, from the data stream, video data including data of a plurality of pictures, a first time information value showing a presentation time of each of those pictures, audio data including data of a plurality of audio frames, and a second time information value showing a presentation time of each of those audio frames; outputting the video data and the audio data to play back the pictures and the audio frames; counting amount of time that has passed since a predetermined reference time; calculating a difference between the first time information value of each said picture and the amount of time passed and holding the difference as a first differential value when the picture is output; calculating a difference between the second time information value of each said audio frame and the amount of time passed and holding the difference as a second differential value when the audio frame is output; and controlling output of the video data according to the magnitude of the first differential value with respect to that of the second differential value.

10. The data processing method of claim 9, wherein the step of counting includes counting the amount of time passed by setting a time when the audio data starts to be output as the predetermined reference time.

11. The data processing method of claim 9, wherein if the first differential value is greater than a sum of the second differential value and a first prescribed value, the step of outputting includes outputting the video data of the picture being output at that point in time again.

12. The data processing method of claim 9, wherein if the first differential value is smaller than the difference between the second differential value and a second prescribed value, the step of outputting includes skipping output of the video data of the picture.

13. The data processing method of claim 9, further comprising the step of receiving an instruction about a pause on playback, wherein the step of outputting includes controlling the output of the video data by reference to the first and second time information values in accordance with the instruction.

14. The data processing method of claim 13, further comprising the step of receiving an instruction to cancel the pause on playback, wherein the step of outputting includes outputting the audio data earlier than the video data in accordance with the instruction if the first time information value is greater than a sum of the second time information value and a third prescribed value.

15. The data processing method of claim 13, further comprising the step of receiving an instruction to cancel the pause on playback, wherein the step of outputting includes outputting the video data earlier than the audio data in accordance with the instruction if the first time information value is smaller than the difference between the second time information value and a fourth prescribed value.

16. The data processing method of claim 13, further comprising the step of receiving an instruction to cancel the pause on playback, wherein the step of outputting includes outputting the video data and the audio data simultaneously in accordance with the instruction.
17. A product, comprising: a data processor, wherein the data processor includes a computer executing a computer program to receive a data stream; separate, from the data stream, video data including data of a plurality of pictures, a first time information value showing a presentation time of each of those pictures, audio data including data of a plurality of audio frames, and a second time information value showing a presentation time of each of those audio frames; output the video data and the audio data to play back the pictures and the audio frames; count an amount of time that has passed since a predetermined reference time; calculate a difference between the first time information value of each said picture and the amount of time passed and hold the difference as a first differential value when the picture is output; calculate a difference between the second time information value of each said audio frame and the amount of time passed and hold the difference as a second differential value when the audio frame is output; and control an output of the video data according to the magnitude of the first differential value with respect to that of the second differential value.
18. The product of claim 17, wherein the step of counting includes counting the amount of time passed by setting a time when the audio data starts to be output as the predetermined reference time.

19. The product of claim 17, wherein if the first differential value is greater than a sum of the second differential value and a first prescribed value, the step of outputting includes outputting the video data of the picture being output at that point in time again.

20. The product of claim 17, wherein if the first differential value is smaller than the difference between the second differential value and a second prescribed value, the step of outputting includes skipping output of the video data of the picture.